Abstract

Analogues  of  linear-combinations-of-order-statistics,  or  L-estimators,  are  suggested  for 
estimating  the  parameters  of  the  linear  regression  model.  The  methods  are  based  on  linear 
combinations  of  the  p-dimensional  "regression  quantiles"  proposed  by  Koenker  and  Bassett. 
A  uniform  Bahadur-type  representation  of  regression  quantiles  is  established,  and  this  permits 
a  general  theory  of  L-estimators  based  on  regression  quantiles  including  those  with  smooth 
weight  functions.  A  leading  example  of  the  proposed  class  of  estimators  is  an  analogue  of  the 
trimmed  mean  which  seems  to  exhibit  certain  advantages  over  earlier  proposals  by  Koenker 
and  Bassett  and  Ruppert  and  Carroll.  A  brief  investigation  of  two  proposals  for  estimating  the 
covariance  matrix  of  this  estimator  is  also  reported. 
1.  Introduction

Analogues  of  a  broad  class  of  L-estimators  for  the  parameters  of  the  linear  regression 
model  are  proposed  and  investigated.  The  methods  are  based  on  the  "regression  quantile" 
statistics  of  Koenker  and  Bassett  (1978). 

Consider the linear model

$$y_i = x_i\beta + u_i, \qquad i = 1, \dots, n$$  (1.1)

where $x_i = (1, x_{i2}, \dots, x_{ip})$ denotes the $i$th row of an $n \times p$ design matrix, and $\beta \in R^p$ is an unknown regression parameter. We will assume throughout that $(u_1, \dots, u_n)$ are independent with common distribution function $F$. Explicit further assumptions on the design and $F$ will be introduced below.

The p-dimensional analogues of the sample quantiles, introduced in KB (1978), solve the problem

$$\min_{b \in R^p} \sum_{i=1}^{n} \rho_\theta(y_i - x_i b)$$  (1.2)

where $\rho_\theta(u)$ denotes the "check" function $\rho_\theta(u) = \theta u^+ + (1-\theta)u^-$ and $u^+$, $u^-$ denote respectively the positive and negative parts of $u$. The set of such solutions will be denoted by $B_\theta$. Note that in the location model, i.e., when $x_i \equiv 1$, $B_\theta$ is simply the usual $\theta$th sample quantiles from the (now i.i.d.) sample $(y_1, \dots, y_n)$ from $F(y - \beta)$. The $\ell_1$ regression problem, (1.2) with $\theta = 1/2$, is also a familiar special case.
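
The location-model remark can be checked directly. The following is a minimal sketch, not part of the original paper: the function names and the small data set are illustrative assumptions. It evaluates the check function and confirms that minimizing the objective of (1.2) with $x_i \equiv 1$ over the data points returns an ordinary sample quantile.

```python
def check_loss(u, theta):
    """Check function rho_theta(u) = theta*u^+ + (1 - theta)*u^-."""
    return theta * max(u, 0.0) + (1.0 - theta) * max(-u, 0.0)


def location_objective(b, ys, theta):
    """Objective of problem (1.2) in the location model (x_i = 1)."""
    return sum(check_loss(y - b, theta) for y in ys)


def location_quantile(ys, theta):
    """Return a minimizer; a minimum is always attained at a data point."""
    return min(sorted(ys), key=lambda b: location_objective(b, ys, theta))


# Illustrative data (hypothetical, chosen only for the demonstration).
ys = [3.1, -0.4, 7.5, 1.2, 2.8, 0.3, 5.0]
print(location_quantile(ys, 0.5))  # 2.8, the sample median
```

Searching only over data points is enough here because the objective is piecewise linear and convex in b, with kinks exactly at the observations.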

Problem (1.2) may be formulated as a linear program and it is easily shown that $B_\theta$ is the convex hull of one or more "basic" solutions of the form $b_h = X_h^{-1}y_h$, where $h$ indexes p-element subsets of $\{1, 2, \dots, n\}$, $X_h$ denotes the sub-design matrix with rows $x_i$, $i \in h$, and $y_h$ is the sub-response vector with coordinates $y_i$, $i \in h$. Thus the "regression quantiles" may be viewed as order statistics corresponding to groups of p observations, and problem (1.2) serves to identify a small number of "interesting" basic solutions, roughly $O(n)$ in our empirical experience, out of the $\binom{n}{p}$ possible basic solutions. Wu (1986) has recently emphasized the fundamental role played by these p-observation subsets in the theory of least-squares estimation.
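
To make the notion of a basic solution concrete, here is a brute-force sketch for simple regression (p = 2). It is an illustration only: it enumerates every pair h, which is practical only for tiny n, and it is not the parametric linear programming algorithm discussed in the text. All names and the data are assumptions introduced for the example.

```python
from itertools import combinations


def check_loss(u, theta):
    """Check function rho_theta(u) = theta*u^+ + (1 - theta)*u^-."""
    return theta * max(u, 0.0) + (1.0 - theta) * max(-u, 0.0)


def objective(b, xs, ys, theta):
    """Objective of problem (1.2) for the line b[0] + b[1]*x."""
    return sum(check_loss(y - (b[0] + b[1] * x), theta) for x, y in zip(xs, ys))


def basic_solutions(xs, ys):
    """Yield (h, b_h) for each pair h = (i, j) with an invertible sub-design X_h."""
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] != xs[j]:
            slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
            yield (i, j), (ys[i] - slope * xs[i], slope)


def regression_quantile(xs, ys, theta):
    """Pick the basic solution with the smallest objective value."""
    return min(basic_solutions(xs, ys), key=lambda hb: objective(hb[1], xs, ys, theta))


# Illustrative data (hypothetical); the last point is a gross outlier.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.1, 1.3, 1.9, 3.2, 3.8, 9.0]
h, b = regression_quantile(xs, ys, 0.5)
print(h, b)
```

Each candidate line interpolates exactly p = 2 observations, which is the sense in which regression quantiles behave like order statistics for groups of p observations.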

Computation of regression quantiles is treated in Koenker and d'Orey (1985). There, an algorithm based on the Barrodale and Roberts (1974) $\ell_1$-regression algorithm is provided to efficiently compute solutions to problem (1.2) for all $\theta \in [0,1]$. This may at first appear onerous, but fortunately it is a straightforward exercise in parametric linear programming, or sensitivity analysis. Once one solution has been identified, the remaining $O(n)$ solutions may be found easily, and each involves essentially one simplex pivot.

An asymptotic theory of finite linear combinations of regression quantiles was developed in KB (1978), and led to simple analogues of the "systematic statistics" of Mosteller (1946), Tukey (1970), Gastwirth (1966) and others. Ruppert and Carroll (1980) showed that a simple analogue of the trimmed mean could be constructed as

$$\tilde\beta_\alpha = (X'WX)^{-1}X'Wy$$  (1.3)

where $W$ is a diagonal matrix with typical element $w_i = I(x_i\hat\beta(\alpha) < y_i < x_i\hat\beta(1-\alpha))$, and $\hat\beta(\theta)$ denotes some selection from $B_\theta$. This estimator trims observations on or below the $\alpha$th and on or above the $(1-\alpha)$th regression quantile plane, and computes a least squares estimate based on the remaining observations. Ruppert and Carroll established, under mild conditions, that $\sqrt{n}(\tilde\beta_\alpha - \beta)$ is asymptotically Gaussian with covariance matrix $\sigma^2(\alpha, F)Q^{-1}$, where $Q = \lim n^{-1}X'X$ and $\sigma^2(\alpha, F)$ is the asymptotic variance of the $\alpha$-trimmed mean from a random sample on $F$.

In simulation experiments, reported briefly in Koenker (1986), it was found that this trimmed least-squares estimator was rather sensitive to influential design points, and exhibited substantial departures from the behavior predicted by its asymptotic theory, especially when p was large relative to n. This finding motivated the present investigation into a considerably broader class of L-estimators based on regression quantiles.

Following Serfling (1980), it is natural to consider estimators of the form

$$\hat\beta = \int_0^1 J(\theta)\hat\beta(\theta)\,d\theta + \sum_{i=1}^{m} w_i\hat\beta(\theta_i)$$  (1.4)

where, as above, if necessary, we have adopted a rule for choosing an element $\hat\beta(\theta)$ from $B_\theta$.
Estimators  of  this  general  form  are  scale  and  reparameterization-of-design  equivariant,  see  KB 
(1978,  Thm.  3.2).  This  is  an  important  advantage  of  L-statistics  over  competing  M-estimates. 
Bickel  (1973)  proposed  analogues  of  L-estimators  for  the  linear  model  based  on  a  preliminary 
estimate,  but  they  are  computationally  complex  and  are  not  equivariant  to  reparameterization 
of  the  design.  Recently,  Welsh  (1985,  1986)  has  proposed  a  class  of  one-step  L-estimators 
which  are  equivariant  and  reasonably  easy  to  compute. 

We will focus here on the first term of (1.4) with $J$ chosen to be reasonably smooth. A leading example of the type we wish to consider is the analogue of the trimmed mean,

$$\hat\beta_\alpha = (1-2\alpha)^{-1}\int_\alpha^{1-\alpha}\hat\beta(\theta)\,d\theta.$$  (1.5)

In the simulations reported in Koenker (1986) this estimator performed extremely well, showing considerably less sensitivity to influential design points than the (asymptotically equivalent) trimmed least squares estimator.
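
In the location model the regression quantile process is just the sample quantile function, which is piecewise constant with jumps at multiples of 1/n, so the trimmed-mean analogue (the quantile function averaged over $[\alpha, 1-\alpha]$ and rescaled by $(1-2\alpha)^{-1}$) can be integrated exactly. The sketch below covers only that special case; the function name and data are assumptions made for illustration, and the general regression case would need the full regression quantile path.

```python
def trimmed_quantile_mean(ys, alpha):
    """(1 - 2*alpha)^{-1} times the integral over [alpha, 1 - alpha]
    of the sample quantile function (location model only)."""
    if not 0.0 <= alpha < 0.5:
        raise ValueError("alpha must lie in [0, 0.5)")
    ordered = sorted(ys)
    n = len(ordered)
    total = 0.0
    for i, y in enumerate(ordered):
        # The sample quantile function equals y on the interval (i/n, (i+1)/n).
        lo = max(i / n, alpha)
        hi = min((i + 1) / n, 1.0 - alpha)
        if hi > lo:
            total += y * (hi - lo)
    return total / (1.0 - 2.0 * alpha)


# Illustrative data (hypothetical) with one gross outlier.
print(trimmed_quantile_mean([1, 2, 3, 4, 100], 0.2))  # approximately 3.0
print(trimmed_quantile_mean([1, 2, 3, 4, 100], 0.0))  # approximately 22.0, the ordinary mean
```

With 20 percent trimming the outlier receives zero weight, while with no trimming the estimator reduces to the sample mean.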

In the next section we establish a uniform $O(n^{-1/4}\log n)$ Bahadur-type representation for the regression quantile process, approximating $\sqrt{n}(\hat\beta(\theta) - \beta(\theta))$ as $1/\sqrt{n}$ times a sum of independent random variables with error negligible to $O(n^{-1/4}\log n)$ uniformly in $\theta$. Applications of this result to the asymptotic theory of L-statistics like (1.5) are treated in Section 3, where we also discuss the problem of estimating the covariance matrix of such estimators.


2.  A  Uniform  Bahadur  Representation  for  Regression  Quantiles 

We will assume throughout this section that $\beta = 0$ and $n^{-1}\sum x_i = (1, 0, \dots, 0)$; this involves no loss of generality due to equivariance considerations.

The  following  design  conditions  are  employed: 

X1:  $n^{-1}X'X = Q + Q_n$ where $Q$ is positive definite and the maximum eigenvalue of $Q_n$ satisfies $\lambda_{\max}(Q_n) = O(n^{-1/4})$
X2:  $\sum \|x_i\|^3 = O(n)$
X3:  $\max_i \|x_i\| = O(n^{1/4})$

X4:  condition  2.10  of  Portnoy  (1985): 

Partition  fi  =  (a,  7)  so  7  e  Rp_1.    Then  for  any  constant  a   (sufficiently  large)  there  is 
t)  >  0    such    that    for    all    ae[-a,a]    and    all    7  e  Rp_1,    7 'M (0,7)  >  tj  5(7)    where 

Mifi)  =  ExtFOdfl  and  5(7)  =  min  (Wp-  WD- 
1=1 

It is not difficult to see that these conditions will hold in typical ANOVA designs, and will hold in probability when the rows of the design $\{x_i : i = 1, 2, \dots\}$ form a random sample from a very wide class of distributions in $R^p$. Results along these lines are given in Portnoy (1985), in particular for condition X4.

Our  condition  on  F  is  the  following: 

F:  $F$ has a density, $f$, and for some $\varepsilon > 0$: $\varphi(u) = f(F^{-1}(u)) > 0$ and $\varphi'(u)$ is uniformly bounded for $u \in [\varepsilon, 1-\varepsilon]$.

Lemma 2.1  Under conditions X1-4, and F, for any $\varepsilon > 0$ there is a $K > 0$ such that

$$\sup_{\varepsilon \le \theta \le 1-\varepsilon} \|\hat\beta(\theta) - \beta(\theta)\| < K(\log n / n)^{1/2}$$  (2.1)

with probability tending to one.

Proof.  Following Portnoy (1985) partition $\beta = (\alpha, \gamma)$, where $\gamma \in R^{p-1}$ and $\hat\beta(\theta) = (\hat\alpha(\theta), \hat\gamma(\theta))$. Let

$$\hat\theta(a) = \sup\{\theta \in [0,1] : \hat\alpha(\theta) < a\}$$  (2.2)

See Bassett and Koenker (1982) for further details on this estimate of $F$. From Lemma 2.1 of Portnoy (1985),

$$\sup_{\varepsilon \le \theta \le 1-\varepsilon} \|\hat\gamma(\theta)\| = O_p((\log n / n)^{1/2})$$  (2.3)

and, from Proposition 2.1 there, for some $c > 0$,

$$|\hat\theta(a) - F(a)| < c/\sqrt{n}$$  (2.4)

uniformly in $|a| < b = \max\{|F^{-1}(\varepsilon)|, |F^{-1}(1-\varepsilon)|\}$ with probability tending to one. Since $f(x) > 0$ and is continuous, there is a $d > 0$ such that for $\theta \in [\varepsilon, 1-\varepsilon]$, with probability tending to one

$$\hat\theta(a + d/\sqrt{n}) > F(a + d/\sqrt{n}) - c/\sqrt{n} > F(a) + Kd/\sqrt{n} - c/\sqrt{n} > F(a)$$  (2.5)

and similarly

$$\hat\theta(a - d/\sqrt{n}) < F(a)$$  (2.6)

where $K = \inf\{f(u) : |u| < b + d/\sqrt{n}\}$. Thus from the definition of $\hat\theta$, we have for $\theta = F(a)$

$$|\hat\alpha(\theta) - F^{-1}(\theta)| < d/\sqrt{n}$$  (2.7)

and the lemma follows from (2.3) and (2.7).  □

The main result of this section is the following uniform Bahadur representation for $\sqrt{n}(\hat\beta(\theta) - \beta(\theta))$, extending a result of Jureckova and Sen (1984).

Theorem 2.1  Under X1-4 and F, with probability tending to one, for any $\varepsilon > 0$,

$$\sqrt{n}(\hat\beta(\theta) - \beta(\theta)) = \frac{1}{\sqrt{n}\,f(F^{-1}(\theta))}\,Q^{-1}\sum_{i=1}^{n} x_i[\theta - I(u_i < F^{-1}(\theta))] + O(n^{-1/4}\log n)$$  (2.8)

uniformly for $\theta \in [\varepsilon, 1-\varepsilon]$.
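
As a consistency check on (2.8), consider the location model, $p = 1$ and $x_i \equiv 1$, so that $Q = 1$ and, under the normalization $\beta = 0$, $\beta(\theta) = F^{-1}(\theta)$. Writing $\hat\xi(\theta)$ and $\xi(\theta)$ for the sample and population quantiles (notation assumed here for illustration), the representation specializes to a form of the familiar Bahadur representation of sample quantiles:

```latex
\sqrt{n}\,\bigl(\hat\xi(\theta) - \xi(\theta)\bigr)
  = \frac{1}{\sqrt{n}\, f(\xi(\theta))}
    \sum_{i=1}^{n} \bigl[\theta - I\bigl(u_i < \xi(\theta)\bigr)\bigr]
    + O\bigl(n^{-1/4}\log n\bigr),
\qquad \theta \in [\varepsilon, 1-\varepsilon].
```

The summands are independent with mean zero and variance $\theta(1-\theta)$, which recovers the usual asymptotic variance $\theta(1-\theta)/f^2(\xi(\theta))$ of a sample quantile.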

Proof.  From KB (1978), $\hat\beta(\theta) = b_h = X_h^{-1}y_h$ if and only if for $j = 1, \dots, p$,

£  ['(*  <  *.£)  -  Sltji  1  Xifo]  xi}-  €  [6-1,  6]  (2.9) 

«=i 

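Thus there is a vector $v \in R^p$ with $\max|v_j| < 1$, for which $\hat\beta(\theta) = b_h$ and

$$\Big\|\sum_{i=1}^{n}[I(y_i < x_i\hat\beta(\theta)) - \theta]x_i - (1-\theta)\sum_{i \in h}x_i\Big\| = O(\|X_h v\|)$$  (2.10)

Since $(1-\theta)\|\sum_{i \in h} x_i\| < p\max\|x_i\| = O(n^{1/4})$ and $\|X_h v\| < [\mathrm{tr}(X_h'X_h)]^{1/2} = O(n^{1/4})$ by X3, and $y_i = x_i\beta(\theta) - F^{-1}(\theta) + u_i$ we have

$$\Big\|\sum[I(u_i < F^{-1}(\theta) + x_i(\hat\beta(\theta) - \beta(\theta))) - \theta]x_i\Big\| = O(n^{1/4}),$$  (2.11)

with probability tending to one, uniformly in $\theta \in [\varepsilon, 1-\varepsilon]$. Let

$$g(\delta,\theta) = \sum[I(u_i < F^{-1}(\theta) + x_i\delta) - \theta]x_i$$  (2.12)

and set

$$T(\delta,\theta) = g(\delta,\theta) - g(0,\theta)$$  (2.13)

and

$$\tilde T(\delta,\theta) = T(\delta,\theta) - ET(\delta,\theta)$$  (2.14)

Now, for $\delta \in \{\delta \in R^p : \|\delta\| < K\sqrt{\log n / n}\}$,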

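$$Eg(\delta,\theta) = \sum x_i(x_i\delta)f(F^{-1}(\theta)) + \sum x_i(x_i\delta)^2 f'(F^{-1}(\theta^*))$$
$$= n(Q + Q_n)\delta f(F^{-1}(\theta)) + \sum\|x_i\|^3\,O(\log n / n)$$  (2.15)
$$= nQ\delta f(F^{-1}(\theta)) + O(n^{1/4}\log n)$$

by X1, X2 and condition F. And the result follows by Lemma 2.1 and Lemma 2.2 (below).  □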

Proposition.  For any $\lambda > 0$, $K$ fixed as in Lemma 2.1, and $\delta \in \Delta = \{\delta \in R^p : \|\delta\| < K\sqrt{\log n / n}\}$,

$$P\{|\tilde T_j| > \lambda n^{1/4}\log n\} < 2\exp\{-\lambda\log n\,(1 + o(1))\}$$  (2.16)

Proof.  By the Markov inequality, for $t > 0$, and any $\lambda_n > 0$

$$P\{|\tilde T_j| > \lambda_n\} < e^{-t\lambda_n}[M_j(t) + M_j(-t)]$$  (2.17)

where $M_j(t)$ is the mgf of $\tilde T_j$. By independence of the $u$'s, $M_j(t) = \prod_i M_{ij}(t)$ where

$$M_{ij}(t) = E\exp\{t x_{ij}(J_i(\delta,\theta) - EJ_i(\delta,\theta))\}$$  (2.18)

and

$$J_i(\delta,\theta) = I(u_i < F^{-1}(\theta) + x_i\delta) - I(u_i < F^{-1}(\theta))$$  (2.19)

Note that $EJ_i = \mathrm{sgn}(x_i\delta)\,p_i$ for $p_i = P\{u_i \text{ between } F^{-1}(\theta) \text{ and } F^{-1}(\theta) + x_i\delta\}$; thus,

$$M_{ij}(t) = p_i\exp\{t x_{ij}(1 - p_i)\,\mathrm{sgn}(x_i\delta)\} + (1 - p_i)\exp\{-t x_{ij}p_i\,\mathrm{sgn}(x_i\delta)\}$$  (2.20)

If $t = O(n^{-1/4})$, $|x_{ij}t|$ is bounded by X3, and for $\delta \in \Delta$,

$$p_i = |x_i\delta|\,f(F^{-1}(\theta^*)) < c_0|x_i\delta| \to 0$$  (2.21)

since $f$ is bounded. Thus,

$$\log M_{ij}(t) < \log(1 + 2p_i(x_{ij}t)^2 e^{|x_{ij}t|}) < 2p_i(x_{ij}t)^2 e^{|x_{ij}t|} < c|x_i\delta|(x_{ij}t)^2\exp\{Btn^{1/4}\}$$  (2.22)

for some constant $c$, by (2.21). Therefore by condition X2, for $t > 0$, and $t = O(n^{-1/4})$,

$$\log M_j(t) < \sum_{i=1}^{n} c'\|\delta\|\,\|x_i\|^3 t^2\exp\{Btn^{1/4}\} < c''\sqrt{n\log n}\;t^2\exp\{Btn^{1/4}\}$$  (2.23)

Finally, by (2.17), with $t = n^{-1/4}$,

$$P\{|\tilde T_j| > \lambda n^{1/4}\log n\} < 2\exp\{-\lambda\log n + c''\sqrt{\log n}\,e^{B}\} = 2\exp\{-\lambda\log n\,(1 + o(1))\}$$  (2.24)

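Lemma 2.2  Under X1-4 and condition F, with probability tending to one,

$$\sup_{\varepsilon \le \theta \le 1-\varepsilon}\|\tilde T(\hat\beta(\theta) - \beta(\theta), \theta)\| = O(n^{1/4}\log n)$$  (2.25)

Proof.  First let $\theta_i = \varepsilon + i/n^3$, $i = 0, 1, 2, \dots, [(1-2\varepsilon)n^3]$ and let $\delta_j \in \Delta$ be the centers of spheres of radius $n^{-3}$ covering $\Delta$. Let

$$B = \{(\delta,\theta) \mid \theta = \theta_i,\ \delta = \delta_j \text{ for some } i \text{ and } j\}$$  (2.26)

Then $\#B < an^3 n^{3p}$, and, hence, from the proposition

$$P\{\sup_B|\tilde T_j(\delta,\theta)| > (3p+5)n^{1/4}\log n\} < an^{3p+3}e^{-(3p+4)\log n} \to 0$$  (2.27)

Consequently, as $n \to \infty$

$$P\{\sup_B\|\tilde T(\delta,\theta)\| > p(3p+5)n^{1/4}\log n\} \to 0$$  (2.28)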

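Now for $\{\delta_1, \delta_2\} \subset \Delta$ with $\|\delta_1 - \delta_2\| < n^{-3}$ and $\{\theta_1, \theta_2\} \subset [\varepsilon, 1-\varepsilon]$ with $|\theta_1 - \theta_2| < n^{-3}$ consider

$$\|T(\delta_1,\theta_1) - T(\delta_2,\theta_2)\| = \Big\|\sum x_i[I(u_i < F^{-1}(\theta_1) + x_i\delta_1) - I(u_i < F^{-1}(\theta_2) + x_i\delta_2)]\Big\|$$  (2.29)

Note that since $f$ is continuous and strictly positive $|F^{-1}(\theta_1) - F^{-1}(\theta_2)| < c_1 n^{-3}$ and hence,

$$|(F^{-1}(\theta_1) + x_i\delta_1) - (F^{-1}(\theta_2) + x_i\delta_2)| < c_1 n^{-3} + c_2 n^{1/4}n^{-3} < c_3 n^{-5/2}$$  (2.30)

Now for $i \ne j$

$$P\{|u_i - u_j| < c_3 n^{-5/2}\} < c_4 n^{-5/2}$$  (2.31)

since $f$ is bounded. Hence

$$P\{\min_{i \ne j}|u_i - u_j| < c_3 n^{-5/2}\} < n(n-1)c_4 n^{-5/2} \to 0$$  (2.32)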


It follows that with probability tending to one, the term in square brackets in (2.29) is nonzero for at most two values of $i$; and, hence,

$$\|T(\delta_1,\theta_1) - T(\delta_2,\theta_2)\| < 2\max\|x_i\| = O(n^{1/4}).$$  (2.33)

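Also, from (2.29) and (2.21)

$$\|ET(\delta_1,\theta_1) - ET(\delta_2,\theta_2)\| < \sum\|x_i\|\,|p_i(\delta_1,\theta_1) - p_i(\delta_2,\theta_2)|$$
$$< \sum\|x_i\|\,|x_i\delta_1|\,|f(F^{-1}(\theta_1)) - f(F^{-1}(\theta_2))| + \sum\|x_i\|\,f(F^{-1}(\theta_2))\,\big||x_i\delta_1| - |x_i\delta_2|\big|$$  (2.34)

Since $f'(x)$ is bounded the first term above is $O(n(\log n)/n^{1/2})\,n^{-3} = O(1)$. And since

$$\big||x_i\delta_1| - |x_i\delta_2|\big| < \|x_i\|\,\|\delta_1 - \delta_2\| < n^{1/4}n^{-3}$$  (2.35)

the second term is also bounded. Hence the left side of (2.34) is bounded, and thus, $\|\tilde T(\delta_1,\theta_1) - \tilde T(\delta_2,\theta_2)\| = O(n^{1/4})$ uniformly for $\{\theta_1, \theta_2\} \subset [\varepsilon, 1-\varepsilon]$ with $|\theta_1 - \theta_2| < n^{-3}$ and $\{\delta_1, \delta_2\} \subset \Delta$ with $\|\delta_1 - \delta_2\| < n^{-3}$ (with probability tending to one). Thus, using (2.27)

$$P\{\sup\{\|\tilde T(\delta,\theta)\| : \theta \in [\varepsilon, 1-\varepsilon],\ \delta \in \Delta\} > bn^{1/4}\log n\} \to 0$$  (2.36)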

Therefore, by Lemma 2.1, (2.25) follows.  □


3.  L-estimators for the Linear Model

Smooth L-estimators for regression may be expressed as

$$\hat\beta = \int_0^1 J(\theta)\hat\beta(\theta)\,d\theta$$  (3.1)

and the results of the previous section immediately yield

Theorem 3.1  Under conditions of Section 2, let $J(\theta)$ denote a bounded, measurable function on $[0,1]$, and suppose there is an $\varepsilon$, satisfying condition F, such that $J(\cdot)$ vanishes outside $[\varepsilon, 1-\varepsilon]$. Then

$$\mathcal{L}(\sqrt{n}(\hat\beta - \beta(J,F))) \to N(0, \sigma^2(J,F)Q^{-1})$$  (3.2)

where $\beta(J,F) = \int_0^1 \beta(\theta)J(\theta)\,d\theta$ and

$$\sigma^2(J,F) = \int_0^1\!\!\int_0^1 (s \wedge t - ts)[f(F^{-1}(t))f(F^{-1}(s))]^{-1}J(t)J(s)\,ds\,dt$$  (3.3)


Proof.  Theorem 2.1 implies that

$$\sqrt{n}(\hat\beta - \beta(J,F)) = -\frac{1}{\sqrt{n}}Q^{-1}\sum x_i w_i + O_p(n^{-1/4}\log n)$$  (3.4)

where

$$w_i = \int_0^1 J(\theta)[f(F^{-1}(\theta))]^{-1}[I(u_i < F^{-1}(\theta)) - \theta]\,d\theta$$  (3.5)

[or $w_i = \int_{-\infty}^{\infty} J(F(v))[I(u_i < v) - F(v)]\,dv$]. The $w_i$ are iid random variables with mean zero and variance $\sigma^2(J,F)$. Conditions X1 and X3 and the Lindeberg-Feller CLT immediately yield (3.2).  □

Remark.  An intriguing special case, not covered by this result, is the "untrimmed mean,"

$$\hat\beta_0 = \int_0^1 \hat\beta(\theta)\,d\theta$$  (3.7)

Under further conditions on the tail behavior of $F$, it is natural to conjecture that $\hat\beta_0$ would have the same limiting behavior as the least squares estimator. The least squares estimator may be written as

$$\hat\beta = \sum_h w_h b_h$$  (3.8)

where $b_h = X_h^{-1}y_h$ as in Section 1 and $w_h = |X_h|^2 / \sum_h |X_h|^2$, and the sums are over all possible $h$'s. (See Wu (1986) for further detail on this result.) Thus while every subset of p observations gets positive weight in (3.8), the asymptotically equivalent form (3.7) places positive weight on the much smaller subset of $b_h$'s which solve problem (1.2). Thus it may be advantageous to resample from $\hat\beta(\theta)$ along the lines recently discussed by Wu (1986) to implement bootstrap methods for regression.
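
The representation of least squares as a determinant-weighted average of basic solutions can be verified numerically. The sketch below does this for simple regression (p = 2), where $X_h$ has rows $(1, x_i)$ and $(1, x_j)$ and so $|X_h| = x_j - x_i$. The function names and data are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def ols_line(xs, ys):
    """Ordinary least squares (intercept, slope) from the textbook formulas."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return (ybar - slope * xbar, slope)


def weighted_basic_solutions(xs, ys):
    """Average the basic solutions b_h with weights proportional to |X_h|^2."""
    total_w = 0.0
    acc = [0.0, 0.0]
    for i, j in combinations(range(len(xs)), 2):
        det = xs[j] - xs[i]  # determinant of X_h = [[1, x_i], [1, x_j]]
        if det == 0:
            continue  # a singular X_h carries weight zero
        slope = (ys[j] - ys[i]) / det
        intercept = ys[i] - slope * xs[i]
        w = det * det
        total_w += w
        acc[0] += w * intercept
        acc[1] += w * slope
    return (acc[0] / total_w, acc[1] / total_w)


# Illustrative data (hypothetical).
xs = [0.0, 1.0, 2.5, 3.0, 4.5, 6.0]
ys = [0.2, 1.1, 2.0, 3.9, 4.1, 7.3]
print(ols_line(xs, ys))
print(weighted_basic_solutions(xs, ys))
```

The two printed pairs agree up to rounding, which illustrates that least squares spreads weight over all pairs, whereas regression quantile methods concentrate on the few basic solutions that solve problem (1.2).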
Natural estimates of $\sigma^2(J, F)$, the scalar factor of the asymptotic covariance matrix, may be constructed in several ways. One approach is to substitute the empirical distribution of the residuals in the expression (3.3) or the equivalent form,

$$\sigma^2(J,F) = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}[F(x \wedge y) - F(x)F(y)]J(F(x))J(F(y))\,dx\,dy.$$  (3.9)

Welsh (1986) derives a convenient form of this general expression by integrating by parts. An alternative approach to estimating $\sigma^2(J,F)$ is to employ the empirical quantile function

$$\hat Q_Y(\theta) = \inf\{\bar x b \mid b \in B_\theta\}$$  (3.10)

which arises naturally from problem (1.2). Here $\bar x$ is the mean design row, i.e., $n^{-1}\sum x_i$. See Bassett and Koenker (1982, 1986) and Portnoy (1985) for further details on $\hat Q_Y(\theta)$. It suffices here to note that under considerably milder conditions than those of Section 2, $\hat Q_Y(\theta)$ is strongly consistent for $Q(\theta) = \bar x\beta + F^{-1}(\theta)$, which may be interpreted as the conditional quantile function of the response variable evaluated at the mean design point.

We have investigated both approaches in the important special case of trimming. The asymptotic variance of the trimmed regression quantile estimator given in (1.5) is, when $F$ is symmetric, the Winsorized variance,

$$\sigma^2(\alpha, \xi) = (1-2\alpha)^{-2}\Big[\int_\alpha^{1-\alpha}\xi^2(\theta)\,d\theta + \alpha\xi^2(\alpha) + \alpha\xi^2(1-\alpha)\Big]$$  (3.11)

where $\xi(\theta) = F^{-1}(\theta)$. The simplest approach to estimating $\sigma^2(\alpha, \xi)$ is simply to replace $\xi$ in (3.11) by the recentered estimate,

$$\hat\xi(\theta) = \hat Q_Y(\theta) - \bar x\hat\beta_\alpha.$$  (3.12)

We will denote this estimator as

$$s_0^2(\alpha) = \sigma^2(\alpha, \hat\xi).$$  (3.13)
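
The link between the double-integral form (3.3) and the Winsorized variance can be checked numerically. The sketch below makes two assumptions for convenience: $F$ is uniform on $(-1/2, 1/2)$, so that $f(F^{-1}(t)) = 1$ and $\xi(\theta) = \theta - 1/2$, and the trimming weight is $J(t) = (1-2\alpha)^{-1}$ on $[\alpha, 1-\alpha]$ and zero elsewhere. The Winsorized form used is the usual symmetric one, with weight $\alpha$ on each of $\xi^2(\alpha)$ and $\xi^2(1-\alpha)$. Function names are hypothetical.

```python
def sigma2_double_integral(alpha, m=400):
    """Midpoint-rule value of (3.3) for trimming weights and uniform F."""
    a, b = alpha, 1.0 - alpha
    h = (b - a) / m
    grid = [a + (k + 0.5) * h for k in range(m)]
    total = 0.0
    for s in grid:
        for t in grid:
            total += min(s, t) - s * t
    return total * h * h / (1.0 - 2.0 * alpha) ** 2


def sigma2_winsorized(alpha):
    """Symmetric Winsorized variance for F uniform on (-1/2, 1/2)."""
    a, b = alpha, 1.0 - alpha
    # Closed form of the integral of (theta - 1/2)^2 over [a, b].
    integral = ((b - 0.5) ** 3 - (a - 0.5) ** 3) / 3.0
    tails = alpha * (a - 0.5) ** 2 + alpha * (b - 0.5) ** 2
    return (integral + tails) / (1.0 - 2.0 * alpha) ** 2


print(sigma2_double_integral(0.1))
print(sigma2_winsorized(0.1))
```

The two values agree to within the quadrature error, and at $\alpha = 0$ the Winsorized form reduces to 1/12, the variance of the uniform distribution.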
De Jongh and de Wet (1986) have investigated several estimators of (3.11) based on residuals from the trimmed least squares estimator. A slight variant of their most successful method is,

$$s_1^2(\alpha) = (1-2\alpha)^{-2}\Big[(n-p)^{-1}\sum r_i^2\,I(\hat\xi(\alpha) < r_i < \hat\xi(1-\alpha)) + \alpha\hat\xi^2(\alpha) + \alpha\hat\xi^2(1-\alpha)\Big]$$  (3.14)

where $r_i = y_i - x_i\hat\beta_\alpha$, and $\hat\xi(\theta)$ is given in (3.12).

To compare the performance of the two estimators we have conducted a small Monte Carlo experiment along the lines developed by Gross (1977). Since $\hat\beta_\alpha$ is translation equivariant, and $s_0^2(\alpha)$, $s_1^2(\alpha)$ are scale equivariant, we can exploit Gross's Monte Carlo swindle for error distributions from the normal/independent family. Given a design matrix $X$, we draw $y_i = u_i = z_i/v_i$, $i = 1, 2, \dots, n$, where the $z_i$ are independent standard normal and the $v_i$ are independent square roots of chi-squared random variables divided by their degrees of freedom. Thus the $u_i$ are i.i.d. Student random variables and we may compute the optimal weighted least squares estimate $\tilde\beta = (X'WX)^{-1}X'Wy$, with $W = \mathrm{diag}(v_i^2)$. Then, as in Gross (1977), for any linear contrast $\hat a = c'\hat\beta_\alpha$,

$$P\{\hat a > ks_i\} = 1 - \Phi((ks_i - \hat a + \tilde a)/\sigma_c)$$

and by symmetry considerations,

$$P\{\hat a < -ks_i\} = \Phi((-ks_i - \hat a + \tilde a)/\sigma_c)$$

where $\Phi$ is the standard normal distribution function, $\tilde a = c'\tilde\beta$, and $\sigma_c^2 = c'(X'WX)^{-1}c$. We average these two probabilities over a number of replications of the experiment for several values of $k$, yielding estimates $\hat p(k_i)$, $i = 1, \dots, k$. Logit($\hat p$) is then regressed on $k$ and we interpolate in logit($\hat p$) to find $k'$ such that $\hat p(k') = .025$.
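
The final interpolation step can be sketched as follows. This is a minimal illustration assuming an ordinary least squares fit of logit($\hat p$) on $k$; the function names and the numbers in the usage example are hypothetical and are not results from the experiment.

```python
import math


def logit(p):
    """Log-odds transform."""
    return math.log(p / (1.0 - p))


def critical_value(ks, ps, target=0.025):
    """Fit logit(p) = a + b*k by least squares and solve a + b*k' = logit(target)."""
    zs = [logit(p) for p in ps]
    n = len(ks)
    kbar = sum(ks) / n
    zbar = sum(zs) / n
    b = sum((k - kbar) * (z - zbar) for k, z in zip(ks, zs)) / sum(
        (k - kbar) ** 2 for k in ks
    )
    a = zbar - b * kbar
    return (logit(target) - a) / b


# Hypothetical tail-probability estimates at four trial values of k.
trial_ks = [1.8, 2.0, 2.2, 2.4]
trial_ps = [0.048, 0.033, 0.022, 0.015]
print(critical_value(trial_ks, trial_ps))
```

Working on the logit scale keeps the fitted probabilities inside (0, 1) and makes the relationship with $k$ close to linear over a narrow range around the target.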

Expected confidence interval lengths (ECILs) may be estimated by averaging $s_i/((y - X\tilde\beta)'W(y - X\tilde\beta))^{1/2}$ over Monte Carlo replications and finally multiplying by $2k'$ times the factor

$$E\hat\sigma = E((y - X\tilde\beta)'W(y - X\tilde\beta))^{1/2} = \sqrt{2}\,\Gamma((n-p+1)/2)/\Gamma((n-p)/2).$$
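
The scaling factor is the mean of the square root of a chi-squared variable with $n - p$ degrees of freedom. A short sketch of its evaluation follows; the function name is an assumption, and log-gamma is used only to avoid overflow for large degrees of freedom.

```python
import math


def expected_chi(df):
    """E[sqrt(chi^2_df)] = sqrt(2) * Gamma((df + 1)/2) / Gamma(df/2)."""
    return math.sqrt(2.0) * math.exp(
        math.lgamma((df + 1) / 2.0) - math.lgamma(df / 2.0)
    )


# Factor for one of the configurations described in the text: n = 25, p = 3.
n, p = 25, 3
print(expected_chi(n - p))
```

As a check, one degree of freedom gives $\sqrt{2/\pi}$, the mean absolute value of a standard normal variable.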

There are 27 different experimental configurations. The factors are

Design:  The X matrix is drawn at random once for each configuration and fixed over experimental replications. The first column of X consists of ones, the remaining columns consist of i.i.d. draws from a Student's t distribution with 1, 3, and $\infty$ degrees of freedom. The design matrix X is then orthogonalized for each configuration.

Errors:  The error distribution is also chosen to be Student's t with 1, 3, and $\infty$ degrees of freedom.

Sample Size:  The sample size is chosen to be 25, 50, and 100.
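
Student errors of the normal/independent form used in the swindle can be generated as in the sketch below. It assumes the scale variable is $v = \sqrt{\chi^2_\nu/\nu}$, with $\nu = \infty$ giving $v = 1$ (normal errors); the function name and the use of Python's standard generator are assumptions and do not reproduce the 'S' generator used in the experiment.

```python
import math
import random


def student_errors(n, df, rng):
    """Draw pairs (u, v) with u = z / v, z standard normal,
    v = sqrt(chi^2_df / df); df = math.inf yields normal errors."""
    draws = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        if math.isinf(df):
            v = 1.0
        else:
            # A chi-squared variable with df degrees of freedom is
            # Gamma(shape = df/2, scale = 2).
            chi2 = rng.gammavariate(df / 2.0, 2.0)
            v = math.sqrt(chi2 / df)
        draws.append((z / v, v))
    return draws


rng = random.Random(12345)  # arbitrary seed for reproducibility
sample = student_errors(25, 3, rng)
print(sample[:3])
```

Keeping each $v_i$ alongside $u_i$ matters because the swindle conditions on the scale variables through the weight matrix $W = \mathrm{diag}(v_i^2)$.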

All other factors are fixed over experimental replications: p = 3 parameters are estimated in every case, ten percent trimming is applied, and the linear contrast employed was $c = (\sqrt{3}, \sqrt{3}, \sqrt{3})'$. The experiment was conducted entirely with the 'S' system of Becker and Chambers (1984). 1000 replications were performed for each configuration. An 'S' macro to compute results for a given configuration is available on request. The random number generator used is the 'S' portable implementation of the Marsaglia uniform generator and thus, recalling the seeds used in the experiment, results should be reproducible on any machine supporting this generator.

In Table 3.1 we report estimated 5% critical values for a two-tailed test on the specified linear contrast. Results are reported for both $s_0$ and $s_1$, and the former yields consistently slightly smaller critical values. The estimated critical values for the n = 25 cases are somewhat larger than one would be led to expect from a naive t-table inspection; however, it is only in the extreme case of Cauchy response and Cauchy design where the discrepancy is substantial. This point is reinforced by examining the results for larger sample sizes. Standard errors for the elements of Table 3.1 are approximately .01, but particularly for the Cauchy design cases it should be emphasized that the results are conditioned on the initial draw of the design.

In Table 3.2 we report ECILs for each of the experimental configurations. Consistently the scale estimate, $s_0$, based on the regression quantile function yields slightly shorter intervals than $s_1$, the estimate employing residuals.

In the case of both tables the results compare favorably with those of Gross for the bisquare M-estimator. They suggest that reliable hypothesis testing and confidence interval estimation is possible for the trimmed regression quantile estimator with modest sample sizes. Further investigation is clearly needed to suggest methods for improving on the simple methods studied here. The bootstrapping suggestions of De Jongh and de Wet (1986) provide a natural alternative approach.




Table 3.1

ESTIMATED CRITICAL VALUES

                                        Design Distribution
Response Distribution   Estimate    Normal    Student(3)    Cauchy

sample size = 25
Normal                  s0          2.21      2.22          2.33
                        s1          2.31      2.32          2.53
Student(3)              s0          2.10      2.38          2.54
                        s1          2.22      2.53          2.73
Cauchy                  s0          1.83      2.30          2.28
                        s1          2.00      2.60          2.48

sample size = 50
Normal                  s0          2.09      2.09          2.14
                        s1          2.14      2.14          2.18
Student(3)              s0          2.02      2.08          2.33
                        s1          2.06      2.12          2.38
Cauchy                  s0          1.80      1.96          3.07
                        s1          1.88      1.96          3.07

sample size = 100
Normal                  s0          2.01      2.02          2.06
                        s1          2.03      2.04          2.08
Student(3)              s0          1.99      2.01          2.26
                        s1          2.01      2.02          2.29
Cauchy                  s0          1.85      1.91          2.99
                        s1          1.88      1.95          3.02
Table 3.2

EXPECTED CONFIDENCE INTERVAL LENGTHS

                                        Design Distribution
Response Distribution   Estimate    Normal    Student(3)    Cauchy

sample size = 25
Normal                  s0          4.36      4.36          4.36
                        s1          4.44      4.44          4.44
Student(3)              s0          6.20      6.20          6.20
                        s1          6.30      6.30          6.30
Cauchy                  s0          17.84     17.84         17.40
                        s1          18.10     18.10         18.10

sample size = 50
Normal                  s0          4.19      4.19          4.24
                        s1          4.22      4.22          4.26
Student(3)              s0          5.19      5.36          5.85
                        s1          5.23      5.41          5.88
Cauchy                  s0          10.10     10.62         15.92
                        s1          10.24     10.73         15.80

sample size = 100
Normal                  s0          4.08      4.11          4.16
                        s1          4.09      4.12          4.16
Student(3)              s0          5.12      5.14          5.70
                        s1          5.13      5.15          5.72
Cauchy                  s0          9.21      9.41          14.10
                        s1          9.28      9.47          14.08